Comparison of symbolic and connectionist approaches to eliminate coarticulation effects in phonemic speech recognition
Authors
Abstract
Two methods for correcting phonemic transcriptions produced by the acoustic processor of a speech recognition system are described and compared. The first method, invented by Prof. Teuvo Kohonen and named Dynamically Expanding Context (DEC), uses a large set of error-correcting rules constructed automatically from examples. This symbolic approach is compared with a connectionist one, which employs a multi-layered feed-forward network trained with back propagation. Our experiments demonstrate that the latter paradigm is far from optimal when a context-dependent mapping (e.g. correction) from one set of symbol strings to another is desired. The DEC method is shown to have better correction capabilities. In addition, the training time required by DEC is a fraction of that required by back propagation.
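To make the idea of context-dependent correction rules built from examples more concrete, here is a minimal sketch in Python. It is not the DEC implementation from the paper: it assumes the erroneous and correct strings are already aligned symbol by symbol (equal length), learns in a single batch pass, and widens the context alternately to the left and to the right only for symbols whose narrower context is ambiguous. The names `context`, `learn_rules`, `correct`, and the padding symbol `#` are hypothetical and chosen for illustration.

```python
from collections import defaultdict

PAD = "#"  # hypothetical boundary symbol, assumed not to occur in the data


def context(s, i, level):
    """Window around s[i]; each level adds one symbol, left first, then right."""
    left = (level + 1) // 2
    right = level // 2
    padded = PAD * left + s + PAD * right
    # s[i] sits at index i + left in padded, so the window starts at i.
    return padded[i : i + left + right + 1]


def learn_rules(pairs, max_level=4):
    """Build rules (level, window) -> corrected symbol from aligned string pairs."""
    for src, tgt in pairs:
        assert len(src) == len(tgt), "sketch assumes one-to-one alignment"
    rules = {}
    pending = [(src, i, tgt[i]) for src, tgt in pairs for i in range(len(src))]
    for level in range(max_level + 1):
        outputs = defaultdict(set)
        for src, i, out in pending:
            outputs[context(src, i, level)].add(out)
        unresolved = []
        for src, i, out in pending:
            key = context(src, i, level)
            if len(outputs[key]) == 1:
                rules[(level, key)] = out  # context is unambiguous at this level
            else:
                unresolved.append((src, i, out))  # conflict: expand the context
        pending = unresolved
        if not pending:
            break
    return rules


def correct(s, rules, max_level=4):
    """Apply the first matching rule per symbol; keep the symbol if none matches."""
    result = []
    for i, ch in enumerate(s):
        for level in range(max_level + 1):
            hit = rules.get((level, context(s, i, level)))
            if hit is not None:
                ch = hit
                break
        result.append(ch)
    return "".join(result)


# Toy data: "b" stays "b" before "a" but should become "p" before "t".
pairs = [("aba", "aba"), ("abt", "apt")]
rules = learn_rules(pairs)
print(correct("abt", rules))
```

In this toy data, the symbol `b` maps to two different outputs, so no rule is stored for it until the window is wide enough to include the following symbol. Real phonemic transcriptions also contain insertions and deletions, which this sketch does not handle.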
Similar resources
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the major obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Modularity and Scaling in Large Phonemic Neural Networks
Scaling connectionist models to larger connectionist systems is difficult because larger networks require increasing amounts of training time and data, and the complexity of the optimization task quickly reaches computationally unmanageable proportions. In this paper, we train several small Time-Delay Neural Networks aimed at all phonemic subcategories (nasals, fricatives, etc.) and report exce...
Lexical activation (and other factors) can mediate compensation for coarticulation
A key dispute in theories of spoken word recognition is whether activation of a lexical representation can affect the perception of sublexical components, such as phonemes. Elman and McClelland (1988) provided evidence for such top-down processing by showing that a prelexical process (compensation for coarticulation) could be affected by lexical activation. However, Pitt and McQueen (1998) rep...
Acoustic modeling for spontaneous speech recognition using syllable dependent models
This paper proposes a syllable context dependent model for spontaneous speech recognition. It is generally assumed that, since spontaneous speech is greatly affected by coarticulation, an acoustic model featuring a longer range phonemic context is required to achieve a high degree of recognition accuracy. This motivated the authors to investigate a tri-syllable model that takes differences in t...
A Connectionist Expert Approach
Artificial Neural Networks (ANNs) are widely and successfully used in speech recognition, but many limitations are still inherent in their topologies and learning style. In an attempt to overcome these limitations, we combine the pattern processing of ANNs and the logical inferencing of symbolic approaches in a hybrid speech recognition system. In particular, we are interested in the Connectio...